Conventional studies of biomolecular behaviors rely largely on the construction of kinetic schemes. Since the selection of these networks is not unique, a concern is raised whether and under which conditions hierarchical schemes can reveal the same experimentally measured fluctuating behaviors and unique fluctuation-related physical properties. To clarify these questions, we introduce stochasticity into the traditional lumping analysis, generalize it from rate equations to chemical master equations and stochastic differential equations, and extract the fluctuation relations between kinetically and thermodynamically equivalent networks under intrinsic and extrinsic noises. The results provide a theoretical basis for the legitimate use of low-dimensional models in the studies of macromolecular fluctuations and, more generally, for exploring stochastic features in different levels of contracted networks in chemical and biological kinetic systems.


I. INTRODUCTION 

Kinetic schemes are widely used for studying the thermodynamic, dynamic, and stochastic properties of macromolecules. These schemes are usually selected to be as simple as possible, such as the 2-state schemes for the bound and unbound states of enzymes or receptors and the open and closed states of ion channels. Nevertheless, they can also be rather sophisticated (e.g., the 8-state inositol trisphosphate receptors, the 10-state hemoglobin, and the 56-state chloride channels). The selection of kinetic schemes is mainly determined by the desired accuracy and the measurable quantities. Since a low-dimensional scheme can usually be contracted from higher-dimensional ones, there exists a cascade of hierarchical Markovian network models suitable for describing the time evolution of the populations of a macromolecule's functional states. These networks are anticipated to have indistinguishable kinetics, exhibiting identical mean trajectories after being projected to the low-dimensional network space. However, models with indistinguishable means do not necessarily have indistinguishable fluctuations. Two questions arise: which schemes give fluctuations more relevant to a real system, and under which conditions can unique fluctuation features be obtained from different levels of contracted schemes? These issues are essential for the reliability of various biological properties derived in terms of the fluctuations of a selected kinetic scheme, such as chemoreception, membrane conductance, and ion channel density.

The inter-network fluctuation relations arise from a comparison between different coarse-grained dynamical systems. This comparison resembles that between different rate equations in the lumping analysis, widely used in systems biology and general chemical engineering. A central issue in that analysis is finding the lumping conditions for eliminating unimportant events or time scales in a large network, of typically over 10"^ species in systems biology, to reduce its complexity. Interestingly, this contraction is mathematically analogous to merging experimentally indistinguishable states to obtain simple transition networks for the conformational change of a macromolecule. For instance, the Hodgkin-Huxley potassium ion channel has 16 configurations depending on whether each of its four gates is open or closed. However, this channel is often regarded as a 2-state system, described by whether or not ions can pass through it in a patch-clamp recording. The contraction from a 16-state to a 2-state model is made because the gating current recording is incapable of resolving the detailed structure of the channel configuration. In terms of lumping analysis, this contraction is an approximate lumping.


Despite that correspondence, the original lumping analysis focuses on the relations between mean dynamics and is not concerned with fluctuations. To extract this stochastic component, we generalize the lumping theory from the original rate equations (RE) to chemical master equations (CME) and stochastic differential equations (SDE) and study kinetically equivalent (KE) and thermodynamically equivalent (TE) hierarchical kinetic schemes under intrinsic and extrinsic noises. The results go beyond the conventional assumption of “fast relaxations” and contribute to our understanding of why a kinetic system can be contracted. In the case of extrinsic noise, different kinetic schemes can give different fluctuations even when their average trajectories are the same. This opens the possibility of identifying a correct kinetic model by observing fluctuations. Notably, lumping conditions are used here for generating complex KE or TE networks from simple networks, the opposite of their original goal of reducing complex networks to simple networks. Furthermore, for the conformational change of macromolecules discussed below, it is sufficient to focus on linear REs and linear lumping transformations.
II. LUMPING RATE EQUATIONS


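Let system A be an n-dimensional kinetic scheme described by the linear RE,

$$\frac{d\bar{\mathbf{N}}}{dt} = M\bar{\mathbf{N}} \quad\text{or}\quad \frac{d\bar{N}_i}{dt} = \sum_{j=1}^{n}\left(k_{ji}\bar{N}_j - k_{ij}\bar{N}_i\right), \qquad (1)$$

where $\bar{N}_i$ is the population of the $i$-th state and may represent the mean dynamics of some stochastic processes discussed later, $M$ denotes the matrix of rate constants $k_{ij}$ from states $i$ to $j$, with $k_{ii} = 0$, and $\bar{\mathbf{N}} = [\bar{N}_1, \bar{N}_2, \ldots, \bar{N}_n]^T$ represents a state vector, in which the superscript $T$ stands for the transpose of a vector. If $U$ is an $n' \times n$ full-rank lumping matrix ($n' < n$), $\bar{\mathbf{N}}$ can be contracted into an $n'$-dimensional vector $\bar{\mathbf{N}}' = [\bar{N}'_1, \bar{N}'_2, \ldots, \bar{N}'_{n'}]^T$ via

$$\bar{\mathbf{N}}' = U\bar{\mathbf{N}}, \qquad (2)$$

which is the state vector of some reduced system A'. If each column of $U$ is a standard unit vector, $U$ denotes a proper lumping (see the example in S1). Since all lumpings in the following discussions are “proper,” this term will be omitted below. The RE that $\bar{\mathbf{N}}'$ satisfies is generally an integro-differential equation with a memory kernel. If that kernel vanishes, the RE has a simple autonomous form as (1),

$$\frac{d\bar{\mathbf{N}}'}{dt} = M'\bar{\mathbf{N}}' \quad\text{or}\quad \frac{d\bar{N}'_a}{dt} = \sum_{b=1}^{n'}\left(k'_{ba}\bar{N}'_b - k'_{ab}\bar{N}'_a\right), \qquad (3)$$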


with $k'_{aa} = 0$, and network A is called “exactly lumpable.” Exact lumping makes the contracted system of an autonomous system again autonomous, self-contained, and free of a memory kernel. If the memory kernel does not vanish but is small, A is called “approximately lumpable,” a case with broad practical application. Exact lumping is the limiting case of all approximate lumpings when the memory effect tends to zero. Equations (2) and (3) together constitute the KE condition between A and A', or the condition under which A can be exactly lumped into A'. Notice that (2) alone is insufficient for this condition, because any $U$ can lump $\bar{\mathbf{N}}$ into some $\bar{\mathbf{N}}'$, which is not necessarily self-contained.

Quantitatively, the KE condition between A and A' can be expressed by their rate constant matrices

$$UM = M'U, \qquad (4)$$

which implies $Ue^{Mt} = e^{M't}U$. When $U$ is used to lump A into A', the $n$ states in A are first partitioned into $n'$ sets $S_a$, with $a = 1, \ldots, n'$, by the row vectors of $U$ (see S1). Then, all states in $S_a$ are merged as the state $a$ in A' and termed “the internal states” of $a$. Using the same procedure to merge all states in $S_a$ on both sides of (1), one obtains the KE condition in terms of rate constants

$$k'_{ab} = \sum_{j \in S_b} k_{ij} \qquad (5)$$

for any $a, b \in \{1, 2, \ldots, n'\}$ with $a \neq b$ and any $i \in S_a$, in analogy to that known for finite Markov chains. Notice that the KE condition is fulfilled only when (5) is satisfied for all $i \in S_a$. In brief, the KE condition can be expressed as (4) or (5), or equivalently as (2) together with (3).

Since (5) does not demand fast relaxations between the internal states in $S_a$, the existence of fast variables or large $k_{ij}$ is not a prerequisite for exact lumpability. However, lumping analysis can also eliminate fast variables, as in the quasi-equilibrium or quasi-steady-state approximations. Given a $U$, whether A described by (1) can be exactly lumped into A' by $U$ is decided by whether A' has an autonomous RE (3), as discussed above. If two autonomous A and A' are given first instead, whether A can be lumped into A' is decided by whether some $U$ can be found to connect them by (5). If such $U$ exists, $\bar{\mathbf{N}}'$ of A' and $\bar{\mathbf{N}}$ of A are indistinguishable, in that the trajectories $\bar{\mathbf{N}}'$ and $U\bar{\mathbf{N}}$ are identical.
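The two forms of the KE condition, (4) and (5), can be checked numerically on a concrete network. The following is a minimal sketch, not part of the original analysis: it uses the four-state and two-state rate constants quoted in the caption of Fig. 1, assumes the lumping {1, 2} and {3, 4}, and the helper names (`rate_matrix`, `lumped_rates`, `matmul`) are illustrative choices.

```python
# Rate constants k[(i, j)] for transitions i -> j (1-based state labels),
# copied from the four-state example of Fig. 1; unlisted rates are zero.
K4 = {(1, 3): 8, (3, 1): 6, (1, 4): 2, (4, 1): 7,
      (2, 3): 9, (3, 2): 6, (2, 4): 1, (4, 2): 5}
K2 = {(1, 2): 10, (2, 1): 12}      # rates k'_ab of the two-state system A'
U = [[1, 1, 0, 0],                 # lumped state 1 collects states {1, 2}
     [0, 0, 1, 1]]                 # lumped state 2 collects states {3, 4}


def rate_matrix(k, n):
    """M of dN/dt = M N: M[j][i] = k_ij for i != j, M[i][i] = -sum_j k_ij."""
    M = [[0] * n for _ in range(n)]
    for (i, j), rate in k.items():
        M[j - 1][i - 1] += rate    # state j gains population from state i
        M[i - 1][i - 1] -= rate    # state i loses the same flux
    return M


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def lumped_rates(k, U):
    """For each pair a != b, the set of sums over j in S_b of k_ij, i in S_a."""
    sets = [[j + 1 for j, u in enumerate(row) if u] for row in U]
    out = {}
    for a, Sa in enumerate(sets, start=1):
        for b, Sb in enumerate(sets, start=1):
            if a != b:
                out[(a, b)] = {sum(k.get((i, j), 0) for j in Sb) for i in Sa}
    return out


M4, M2 = rate_matrix(K4, 4), rate_matrix(K2, 2)
sums = lumped_rates(K4, U)
# Condition (5): every i in S_a must give the same sum, equal to k'_ab.
ke_by_rates = all(s == {K2[ab]} for ab, s in sums.items())
# Condition (4): U M = M' U.
ke_by_matrices = matmul(U, M4) == matmul(M2, U)
print(sums, ke_by_rates, ke_by_matrices)
```

Each set in `sums` contains a single value exactly when the row sums agree for all internal states, which is the "for all $i \in S_a$" requirement of (5); no assumption about fast internal relaxation enters the check.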


III. LUMPING MASTER EQUATIONS 


To extract the fluctuation relations of intrinsic noises between hierarchical networks, we extend the lumping analysis from the RE (1) to its CME. Suppose a macromolecule has $n$ conformational states whose transition network A obeys the kinetic equation (1). If a system consists of $N$ macromolecules, its CME,

$$\frac{d\mathbf{P}}{dt} = L\mathbf{P} \quad\text{or}\quad \frac{dP_{\mathbf{N}}(t)}{dt} = \sum_{i,j=1}^{n} k_{ij}\left[(N_i + 1)P_{\mathbf{N}-\boldsymbol{\omega}_{ij}}(t) - N_i P_{\mathbf{N}}(t)\right], \qquad (6)$$

describes the evolution of the joint probability $P_{\mathbf{N}}(t)$ of finding the state vector $\mathbf{N} = [N_1, N_2, \ldots, N_n]^T$ at time $t$, where $N_i \geq 0$ is the number of macromolecules in the $i$-th state and $\sum_i N_i = N$. Therein, $\mathbf{N}$ is related to the $\bar{\mathbf{N}}$ in (1) by $\sum_{\mathbf{N}} N_i P_{\mathbf{N}}(t) = \bar{N}_i$, where the sum runs over all accessible $\mathbf{N}$. The vector $\boldsymbol{\omega}_{ij}$ has values $-1$ and $+1$ in its $i$-th and $j$-th components, respectively, and 0 elsewhere. It stands for the change of molecule numbers in different states during the reaction shifting one molecule from $i$ to $j$. Notice that $\mathbf{P}$ is a vector whose “$\mathbf{N}$-th” component is the probability $P_{\mathbf{N}}(t)$, just as $\bar{\mathbf{N}}$ in (1) is a vector whose $i$-th component is $\bar{N}_i$.

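For each lumping matrix $U$, which contracts $\mathbf{N}$ of A into $\mathbf{N}' = U\mathbf{N}$ of A', there exists an associated lumping operator $\mathcal{U}$, which contracts $\mathbf{P}$ into a reduced vector

$$\mathbf{P}' = \mathcal{U}\mathbf{P}, \qquad (7)$$

whose $\mathbf{N}'$-th component is (see S3)

$$P'_{\mathbf{N}'}(t) = \sum_{\mathbf{N}} \prod_{c=1}^{n'} \delta\!\left(N'_c - \sum_{k \in S_c} N_k\right) P_{\mathbf{N}}(t), \qquad (8)$$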


where the sets $S_c$ are partitioned by $U$ as explained in the text that follows (4) and $\delta(X' - X)$ is a Kronecker delta whose value is one when $X' = X$ and zero elsewhere. If $U$ is arbitrary, $\bar{\mathbf{N}}'$ does not necessarily obey a simple RE as (3) and $P'_{\mathbf{N}'}(t)$ does not necessarily satisfy any CME of the same form as (6). However, if $U$ can exactly lump A into A', $\bar{\mathbf{N}}'$ does follow (3) and $P'_{\mathbf{N}'}(t)$ indeed obeys a simple lumped CME

$$\frac{d\mathbf{P}'}{dt} = L'\mathbf{P}' \quad\text{or}\quad \frac{dP'_{\mathbf{N}'}(t)}{dt} = \sum_{a,b=1}^{n'} k'_{ab}\left[(N'_a + 1)P'_{\mathbf{N}'-\boldsymbol{\omega}'_{ab}}(t) - N'_a P'_{\mathbf{N}'}(t)\right], \qquad (9)$$

which turns out to be the CME of A' (see S3). Alternatively, suppose the REs of A and A' are (1) and (3) and some $U$ can exactly lump A into A' through (2). Then their $\mathbf{P}$ and $\mathbf{P}'$ in (6) and (9) are related by (7) and thus indistinguishable from each other, which is the exact lumpability in terms of joint probabilities. Just as (2) and (3) form the KE condition between two REs, (7) and (9) constitute the KE condition on the level of CME. With the same argument as for (4), the lumping condition for the CME is

$$\mathcal{U}L = L'\mathcal{U}. \qquad (10)$$

Notably, the exactly lumped CME (9) via the KE condition is distinct from the reduced CME entirely based on the time scale separations.
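The operator identity (10) can be verified by brute force on a very small system. The sketch below is illustrative only: it assumes the Fig. 1 rate constants, a hypothetical system size of two molecules (`N_TOT = 2`), and sparse dictionaries for the generators; the function names are not taken from the paper.

```python
from itertools import product

# Fig. 1 rate constants k[(i, j)] for i -> j (1-based); unlisted rates are zero.
K4 = {(1, 3): 8, (3, 1): 6, (1, 4): 2, (4, 1): 7,
      (2, 3): 9, (3, 2): 6, (2, 4): 1, (4, 2): 5}
K2 = {(1, 2): 10, (2, 1): 12}
SETS = [(0, 1), (2, 3)]   # zero-based indices of the internal states S_1, S_2
N_TOT = 2                 # hypothetical, deliberately tiny system size


def states(n, total):
    """All occupation vectors (N_1, ..., N_n) with sum_i N_i = total."""
    return [s for s in product(range(total + 1), repeat=n) if sum(s) == total]


def generator(k, n, total):
    """Sparse CME generator of the form (6): L[(target, source)] = rate."""
    L = {}
    for s in states(n, total):
        for (i, j), rate in k.items():
            if s[i - 1] == 0:
                continue
            t = list(s)
            t[i - 1] -= 1          # one molecule leaves state i
            t[j - 1] += 1          # and enters state j
            flux = rate * s[i - 1]
            L[(tuple(t), s)] = L.get((tuple(t), s), 0) + flux
            L[(s, s)] = L.get((s, s), 0) - flux
    return L


def lump(s):
    """N' = U N for the partition SETS."""
    return tuple(sum(s[i] for i in S) for S in SETS)


def op_times_generator(L):
    """(U_op L)[N', N]: sum L[M, N] over all M with lump(M) = N'."""
    out = {}
    for (t, s), v in L.items():
        out[(lump(t), s)] = out.get((lump(t), s), 0) + v
    return {key: v for key, v in out.items() if v != 0}


def generator_times_op(Lp, fine_states):
    """(L' U_op)[N', N] = L'[N', lump(N)]."""
    out = {}
    for s in fine_states:
        for (tp, sp), v in Lp.items():
            if sp == lump(s):
                out[(tp, s)] = out.get((tp, s), 0) + v
    return {key: v for key, v in out.items() if v != 0}


fine = states(4, N_TOT)
L4, L2 = generator(K4, 4, N_TOT), generator(K2, 2, N_TOT)
lumpable = op_times_generator(L4) == generator_times_op(L2, fine)

# Control: breaking the row-sum condition (5) also breaks (10).
broken = dict(K4)
broken[(1, 3)] = 9
broken_lumpable = (op_times_generator(generator(broken, 4, N_TOT))
                   == generator_times_op(L2, fine))
print(len(fine), lumpable, broken_lumpable)
```

Because all rates are integers, the comparison is exact. The control case changes a single rate so that states 1 and 2 no longer share the same total outflow to the other set, and the identity fails, in line with the statement that lumpability of the RE and of the CME go together.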

The above argument indicates that the exact lumpability in RE (1) implies the exact lumpability in its CME (6) and vice versa (see S2). Therefore, the KE condition is a rather strong condition for systems under intrinsic noises. It conveys not only the original meaning of identical first moments, $U\bar{\mathbf{N}}$ and $\bar{\mathbf{N}}'$, but also the identities of all other moments, owing to the identity of probabilities, $\mathcal{U}\mathbf{P} = \mathbf{P}'$ (see (2.13) in S2). Physically, it indicates that experimentally measured fluctuations cannot be used for judging whether a state has internal states, if the fluctuations are caused by small numbers of macromolecules.

Among all moments, of special interest are the indistinguishable second moments,

$$\sigma' = U\sigma U^T, \qquad (11)$$

where $\sigma_{ij} = \langle \delta N_i\, \delta N_j \rangle$ ($\sigma'_{ab} = \langle \delta N'_a\, \delta N'_b \rangle$) is an average over the probability $P_{\mathbf{N}}(t)$ ($P'_{\mathbf{N}'}(t)$) and $\delta N_i = N_i - \bar{N}_i$ ($\delta N'_a = N'_a - \bar{N}'_a$) is the fluctuation around the mean $\bar{N}_i$ of $N_i$ ($\bar{N}'_a$ of $N'_a$) defined in (6). In Fig. 1, the indistinguishable variances induced by the intrinsic noises of two KE networks are numerically confirmed.

Besides the strict KE condition, a kinetic scheme may be selected merely because it is TE to the real system. If two kinetic networks A and A' are TE to each other, the stationary states $\bar{\mathbf{N}}'^{s}$ and $\bar{\mathbf{N}}^{s}$ of their REs are related by $\bar{\mathbf{N}}'^{s} = U\bar{\mathbf{N}}^{s}$ via some lumping matrix $U$. The stationary solution of the CME is the multinomial distribution,

$$P^{s}_{\mathbf{N}} = N! \prod_{i=1}^{n} \frac{1}{N_i!}\left(\frac{\bar{N}^{s}_i}{N}\right)^{N_i}, \qquad (12)$$

where $\bar{N}^{s}_i$ is the $i$-th component of $\bar{\mathbf{N}}^{s}$. Let $\mathbf{P}^{s}$ be the vector whose $\mathbf{N}$-th component is the $P^{s}_{\mathbf{N}}$ of A and $\mathbf{P}'^{s}$ be the vector whose $\mathbf{N}'$-th component is the $P'^{s}_{\mathbf{N}'}$ of a TE system A' of A. One can show that $\mathbf{P}^{s}$ and $\mathbf{P}'^{s}$ are related by (see S3)

$$\mathbf{P}'^{s} = \mathcal{U}\mathbf{P}^{s} \qquad (13)$$

irrespective of whether A' is KE to A or not. More precisely, (13) is sufficient and necessary for the TE condition $\bar{\mathbf{N}}'^{s} = U\bar{\mathbf{N}}^{s}$, or is the TE condition on the level of stationary joint probability (see S3). While under the KE condition the contracted probability $P'_{\mathbf{N}'}(t)$ must satisfy (9) at any $t$, under the TE condition it must only obey the form (12) as $t \to \infty$.

An arbitrary network A does not always have a reduced KE system. However, it usually has infinitely many reduced TE systems A', which are TE to one another. An interesting indication from (7) and (13) is that if A and A' are TE, but not KE, to each other, their initially distinguishable $\mathbf{P}$ and $\mathbf{P}'$ will become indistinguishable as $t \to \infty$, irrespective of which $U$ is used to contract A to A' (Fig. 2). Therefore, the lumpability between the probabilities of TE systems is similar to the Lyapunov function for quantifying entropy production, where the Kullback-Leibler divergence may be a proper lumpability measure.
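The stationary relation (13) amounts to the statement that merging categories of a multinomial distribution yields the multinomial (here binomial) distribution of the merged probabilities. The sketch below illustrates this under stated assumptions: the three-state rate constants follow one reading of the partly garbled Fig. 2 caption, the system size `N_TOT = 6` is hypothetical, and the stationary state is obtained from the standard spanning-tree formula for a three-state network rather than from any procedure given in the paper.

```python
from math import comb, factorial, isclose, prod

# Assumed reading of the Fig. 2 rates k[(i, j)] for i -> j (1-based).
K3 = {(1, 2): 0.07, (2, 1): 1.0, (2, 3): 0.5,
      (3, 2): 9.0, (3, 1): 0.4, (1, 3): 30.0}
N_TOT = 6  # hypothetical small number of macromolecules


def stationary3(k):
    """Stationary occupation probabilities of a three-state network,
    using the spanning-tree (Kirchhoff) weights of each state."""
    w1 = k[2, 1] * k[3, 1] + k[2, 3] * k[3, 1] + k[3, 2] * k[2, 1]
    w2 = k[1, 2] * k[3, 2] + k[1, 3] * k[3, 2] + k[3, 1] * k[1, 2]
    w3 = k[1, 3] * k[2, 3] + k[1, 2] * k[2, 3] + k[2, 1] * k[1, 3]
    z = w1 + w2 + w3
    return [w1 / z, w2 / z, w3 / z]


def multinomial(counts, probs):
    """Multinomial probability of the form (12) with probs = N^s_i / N."""
    coef = factorial(sum(counts))
    for c in counts:
        coef //= factorial(c)
    return coef * prod(p ** c for p, c in zip(probs, counts))


p = stationary3(K3)

# Lumping {1, 2} -> 1' and {3} -> 2': sum the fine stationary probabilities
# over all N with N_1 + N_2 = m, as the delta functions in (8) prescribe.
lumped = [sum(multinomial((n1, m - n1, N_TOT - m), p) for n1 in range(m + 1))
          for m in range(N_TOT + 1)]

# Stationary distribution of the reduced two-state system with N'^s = U N^s.
reduced = [comb(N_TOT, m) * (p[0] + p[1]) ** m * p[2] ** (N_TOT - m)
           for m in range(N_TOT + 1)]

agree = all(isclose(a, b, rel_tol=1e-9) for a, b in zip(lumped, reduced))
print(agree)
```

Only the stationary probabilities enter this check, so it holds for any reduced network with the same lumped stationary state, whether or not that network is KE to the original one.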


IV. LUMPING STOCHASTIC DIFFERENTIAL 
EQUATIONS 

Another frequently used approach for exploring fluctuations is the SDE,

$$\frac{d\mathbf{N}}{dt} = M\mathbf{N} + \mathbf{f}, \qquad (14)$$

where $\mathbf{N} = \bar{\mathbf{N}} + \delta\mathbf{N}$ is a real-valued random variable with the fluctuations $\delta\mathbf{N}$ about the ensemble mean $\bar{\mathbf{N}}$, which satisfies a deterministic equation as (1). Here, $\mathbf{f}$ is a Gaussian white noise with $\langle \mathbf{f}(t) \rangle = 0$, $\langle \mathbf{f}(t)\mathbf{f}^T(t') \rangle = \Gamma\,\delta(t - t')$, and $\langle \mathbf{f}(t')\mathbf{N}^T(t) \rangle = 0$ for $t < t'$, where the covariance matrix $\Gamma$ is symmetric, positive semi-definite, and generally time-dependent. The solution of (14), $\mathbf{N}(t) = e^{Mt}\mathbf{N}(0) + \int_0^t e^{M\tau}\mathbf{f}(t - \tau)\,d\tau$, is also a Gaussian random variable. The conditional covariance of $\delta\mathbf{N}$ is $\sigma = \langle \delta\mathbf{N}\,\delta\mathbf{N}^T \rangle = \int_0^t e^{M\tau}\,\Gamma\,e^{M^T\tau}\,d\tau$, which is symmetric and has the time derivative $d\sigma/dt = M\sigma + \sigma M^T + \Gamma$. This equation reduces to the fluctuation-dissipation theorem (FDT) when the system reaches equilibrium as $t \to \infty$, where $d\sigma/dt$ vanishes. If $\mathbf{f}$ represents an intrinsic noise, $\sigma$ will be the $\sigma$ in (11) when the system is close to the thermodynamic limit. Together with the given $M$, it uniquely determines $\Gamma$ via the FDT. The $\Gamma$ in the chemical Langevin equation belongs to this category. If $\mathbf{f}$ is an extrinsic noise, $\sigma$ and $\Gamma$ can be freely tuned as long as they comply with the FDT.

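Let A and A' be two network models approaching a real system, where A is described by (14) and A' satisfies

$$\frac{d\mathbf{N}'}{dt} = M'\mathbf{N}' + \mathbf{f}'. \qquad (15)$$

Here $\mathbf{N}' = \bar{\mathbf{N}}' + \delta\mathbf{N}'$ and $\mathbf{f}'$ has statistical properties analogous to $\mathbf{f}$. If A and A' are KE to each other, they are connected by some $U$ via $\bar{\mathbf{N}}' = U\bar{\mathbf{N}}$ (notably not $\mathbf{N}' = U\mathbf{N}$). The covariance of the fluctuations of $U\mathbf{N}$ is

$$U\sigma U^T = \langle U\,\delta\mathbf{N}\,\delta\mathbf{N}^T U^T \rangle = \int_0^t U e^{M\tau}\,\Gamma\,e^{M^T\tau} U^T\,d\tau = \int_0^t e^{M'\tau}\,U\Gamma U^T\,e^{M'^T\tau}\,d\tau,$$

where the exchange relation implied by (4) has been used to obtain the last equality. Since $U\sigma U^T$ is indistinguishable from $\sigma$, the distinguishability between the covariances $\sigma'$ and $\sigma$ of two KE systems A' and A can be determined by the difference

$$\sigma_{\mathrm{diff}} = \sigma' - U\sigma U^T = \int_0^t e^{M'\tau}\,\Gamma_{\mathrm{diff}}\,e^{M'^T\tau}\,d\tau, \qquad (16)$$

where $\Gamma_{\mathrm{diff}} = \Gamma' - U\Gamma U^T$ is a time-dependent symmetric matrix. While (16) tells us that $\Gamma_{\mathrm{diff}} = 0$ implies $\sigma_{\mathrm{diff}} = 0$, its time derivative, $d\sigma_{\mathrm{diff}}/dt = e^{M't}\,\Gamma_{\mathrm{diff}}\,e^{M'^T t}$, implies the opposite, since $e^{M't}$ is an invertible matrix. Thus, $\sigma_{\mathrm{diff}} = 0$ if and only if

$$\Gamma_{\mathrm{diff}} = 0, \quad\text{or equivalently}\quad \Gamma' = U\Gamma U^T. \qquad (17)$$

This relation was already known for $U$ replaced by invertible transformations (see (8.2.39) in the cited reference), for which the argument is more straightforward than that for (17).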

Relation $\Gamma_{\mathrm{diff}} = 0$ in (17) is a weak condition, under which A and A' have only “statistically” indistinguishable $\mathbf{N}$ and $\mathbf{N}'$. A plausible stronger condition is

$$\mathbf{f}' = U\mathbf{f}, \qquad (18)$$

which fulfills (17) and generates indistinguishable individual stochastic trajectories $\mathbf{N}$ and $\mathbf{N}'$. Both (17) and (18) lead to indistinguishable covariances and variances of fluctuations of $\mathbf{N}$ and $\mathbf{N}'$. Together with the indistinguishable means of the KE condition, they yield the indistinguishable Gaussian distributions of $\mathbf{N}$ and $\mathbf{N}'$.
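The strong condition (18) can be illustrated with a simple Euler-Maruyama integration of (14) and (15). This is a sketch under assumptions that are not in the paper: the rate matrices are those implied by the Fig. 1 rate constants, the noise is a hypothetical extrinsic white noise with $\Gamma = \texttt{amp}^2 I$, and the step size, seed, and initial state are arbitrary choices.

```python
import random

# Rate matrices of dN/dt = M N built from the Fig. 1 rate constants.
M = [[-10, 0, 6, 7],
     [0, -10, 6, 5],
     [8, 9, -12, 0],
     [2, 1, 0, -12]]
MP = [[-10, 12],
      [10, -12]]
U = [[1, 1, 0, 0],
     [0, 0, 1, 1]]


def matvec(A, x):
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def simulate(steps=2000, dt=1e-3, amp=0.5, seed=1):
    """Integrate the four-state SDE under noise f and the two-state SDE
    under f' = U f; return the largest deviation |U N(t) - N'(t)|."""
    rng = random.Random(seed)
    x = [0.4, 0.1, 0.3, 0.2]       # hypothetical initial fractions of A
    y = matvec(U, x)               # matching initial state of A'
    worst = 0.0
    for _ in range(steps):
        # Noise increment f dt ~ amp * sqrt(dt) * standard normal.
        xi = [amp * rng.gauss(0.0, 1.0) * dt ** 0.5 for _ in x]
        dx, dy = matvec(M, x), matvec(MP, y)
        x = [a + dt * b + c for a, b, c in zip(x, dx, xi)]
        y = [a + dt * b + c for a, b, c in zip(y, dy, matvec(U, xi))]
        worst = max(worst, max(abs(u - v) for u, v in zip(matvec(U, x), y)))
    return worst


worst = simulate()
print(worst)
```

Because $UM = M'U$, each Euler-Maruyama step of the reduced system reproduces the lumped step of the full system, so the two trajectories differ only by floating-point rounding. Drawing $\mathbf{f}'$ independently with covariance $U\Gamma U^T$ would instead satisfy only the weaker statistical condition (17).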

Although (17) shows that $\sigma_{\mathrm{diff}} = 0$ if and only if $\Gamma_{\mathrm{diff}} = 0$, it does not reveal whether two KE systems should have $\sigma_{\mathrm{diff}} = 0$ or not. For intrinsic noises, the indistinguishable covariances in (11) from the CME approach lead to the expectation that $\sigma_{\mathrm{diff}} = 0$ in the SDE approach, because the SDE can describe CME fluctuations near the thermodynamic limit. According to (17), this expectation would be true if $\Gamma_{\mathrm{diff}} = 0$, which indeed can be proved (see $\Gamma_{ij}$ of ion channels below and S4). For extrinsic noises, $\Gamma$ is not decided by $\bar{\mathbf{N}}$ and distinct $\Gamma$'s will generate different fluctuations. Let $V_{\mathrm{diff}}$ be a variance matrix whose diagonal terms are the same as those of $\sigma_{\mathrm{diff}}$ and whose off-diagonal terms are zero. For two KE systems A and A', (16) implies the simple ordering rule for the variances of their state fluctuations at any $t$:

$$\Gamma_{\mathrm{diff}} \geq 0\ (\leq 0,\ = 0) \;\Rightarrow\; V_{\mathrm{diff}} \geq 0\ (\leq 0,\ = 0), \qquad (19)$$

where $\geq 0$ ($\leq 0$) and $= 0$ stand for positive (negative) semi-definite and null matrices, respectively.
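The ordering rule (19) can be illustrated by propagating $\sigma_{\mathrm{diff}}$ for a fixed $\Gamma_{\mathrm{diff}}$. The sketch below makes several assumptions of its own: $\Gamma_{\mathrm{diff}}$ is taken constant in time, $M'$ is the two-state rate matrix implied by Fig. 1, the matrix `PSD` is an arbitrary positive definite example, and the continuous-time integral (16) is replaced by the covariance recursion of an Euler-Maruyama discretisation, $\sigma \leftarrow A\sigma A^T + \Gamma_{\mathrm{diff}}\,dt$ with $A = I + M'dt$.

```python
MP = [[-10.0, 12.0],
      [10.0, -12.0]]          # two-state rate matrix M' from Fig. 1


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def variance_differences(gamma_diff, steps=1000, dt=1e-3):
    """Diagonal of sigma_diff after each step, starting from sigma_diff = 0."""
    n = len(MP)
    A = [[(1.0 if i == j else 0.0) + dt * MP[i][j] for j in range(n)]
         for i in range(n)]
    At = transpose(A)
    sigma = [[0.0] * n for _ in range(n)]
    diag = []
    for _ in range(steps):
        prop = matmul(matmul(A, sigma), At)      # A sigma A^T
        sigma = [[prop[i][j] + dt * gamma_diff[i][j] for j in range(n)]
                 for i in range(n)]
        diag.append((sigma[0][0], sigma[1][1]))
    return diag


PSD = [[0.3, 0.1], [0.1, 0.2]]                   # hypothetical Gamma_diff >= 0
NSD = [[-g for g in row] for row in PSD]         # the same matrix with sign flipped
ZERO = [[0.0, 0.0], [0.0, 0.0]]

up = variance_differences(PSD)
down = variance_differences(NSD)
flat = variance_differences(ZERO)
print(up[-1], down[-1], flat[-1])
```

With a positive semi-definite $\Gamma_{\mathrm{diff}}$ the reduced model shows variances at least as large as those of the lumped full model at every step, with a negative semi-definite one the order is reversed, and with $\Gamma_{\mathrm{diff}} = 0$ the variances coincide.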

In practice, which of (17), (18), and (19) is the correct relation between two KE models A and A' of a real macromolecule depends on what we study. For intrinsic noises, the $\Gamma$ and $\Gamma'$ of A and A' can be analytically derived and must be related by (17). For extrinsic noises, if A and A' are to approach the same experimental data, their stochastic forces should obey (18). If A and A' are to approach two independently measured data sets of the same macromolecule, their fluctuations may have diverse orderings (19), because environmental noises in different experiments are likely different. Yet, if the noises are statistically the same, the covariance relation is (17), as for intrinsic noises.

Experimentally, fluctuations have been measured to predict the ion channel density, e.g., in nerve fibers of Rana pipiens. To model this experiment with SDE (14), one considers a variety of channels, each of which can stochastically transit between $n$ conformational states, with transition probabilities given by the rate constants in (1). According to the canonical theory or the linear noise approximation, the stochastic force $\mathbf{f}$ in (14) has the covariance

$$\Gamma_{ij} = \sum_{k=1}^{n}\left(k_{ki}\bar{N}_k + k_{ik}\bar{N}_i\right)\delta_{ij} - \left(k_{ij}\bar{N}_i + k_{ji}\bar{N}_j\right),$$

where $\bar{N}_i$ is the probability of finding a channel in the $i$-th state and $\delta_{ij}$ denotes the Kronecker delta. This covariance depends on the evolution of the mean value $\bar{\mathbf{N}}$ and thus varies with time. If the channel is modeled by a two-state (open/closed) system, the (1,1) entry of its equilibrium covariance, $\Gamma^{s}_{11} = k_{12}\bar{N}^{s}_1 + k_{21}\bar{N}^{s}_2$, complies with Onsager's statistical theory of equilibrium ensembles. If the channel is modeled by two KE systems of different dimensions with the same form as $\Gamma_{ij}$, they fulfill $\Gamma_{\mathrm{diff}} = 0$ in (17) (see S4) and then $\sigma_{\mathrm{diff}} = V_{\mathrm{diff}} = 0$. Therefore, the indistinguishability $\sigma_{\mathrm{diff}} = 0$ from the Gaussian probability in the SDE approach coincides with the indistinguishability (11) from the joint probability in the CME approach.
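The statement that two KE systems with intrinsic noise fulfill $\Gamma_{\mathrm{diff}} = 0$ can be checked directly from the covariance formula above. The following sketch is illustrative: it uses the Fig. 1 rate constants, a hypothetical non-stationary mean population vector, and a helper `gamma` that accumulates the formula reaction by reaction.

```python
# Fig. 1 rate constants k[(i, j)] for i -> j (1-based); unlisted rates are zero.
K4 = {(1, 3): 8, (3, 1): 6, (1, 4): 2, (4, 1): 7,
      (2, 3): 9, (3, 2): 6, (2, 4): 1, (4, 2): 5}
K2 = {(1, 2): 10, (2, 1): 12}
U = [[1, 1, 0, 0],
     [0, 0, 1, 1]]


def gamma(k, n, mean):
    """Noise covariance
    Gamma_ij = delta_ij * sum_k (k_ki N_k + k_ik N_i) - (k_ij N_i + k_ji N_j),
    assembled from the mean flux of every reaction i -> j."""
    G = [[0] * n for _ in range(n)]
    for (i, j), rate in k.items():
        flux = rate * mean[i - 1]
        G[i - 1][i - 1] += flux    # reaction leaves state i
        G[j - 1][j - 1] += flux    # and arrives in state j
        G[i - 1][j - 1] -= flux    # anticorrelated change of N_i and N_j
        G[j - 1][i - 1] -= flux
    return G


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


mean4 = [400, 100, 300, 200]                     # hypothetical mean populations
mean2 = [sum(u * m for u, m in zip(row, mean4)) for row in U]   # N' = U N

gamma4 = gamma(K4, 4, mean4)
gamma2 = gamma(K2, 2, mean2)
lumped = matmul(matmul(U, gamma4), transpose(U))  # U Gamma U^T
gamma_diff_is_zero = lumped == gamma2
print(lumped, gamma2, gamma_diff_is_zero)
```

The chosen mean populations are not stationary, which illustrates that for intrinsic noise the relation $\Gamma' = U\Gamma U^T$ holds along the whole mean trajectory of two KE networks and not only at equilibrium.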

V. CONCLUSION 

Theoretically, we generalized the lumping theory from deterministic dynamics to stochastic processes. This allows us to compare stochastic properties between hierarchical networks, such as networks of small systems, which are sensitive to external noises, or large networks whose species contain small numbers of copies. In applications, we introduced lumping techniques from systems biology to molecular biology to explore the fluctuation relations of experimentally indistinguishable kinetic schemes of biomolecules and the legitimacy of estimating macromolecular fluctuations by low-dimensional schemes. These findings represent a kind of contraction beyond the widely discussed ones based on “fast relaxations” and are useful for extracting correct kinetic models by observing extrinsic-noise-induced fluctuations. The analytical results derived from exact lumping here provide limiting properties for networks connected by all kinds of approximate lumping conditions. They further give insights into more general fluctuation relations in other contraction theories, which usually utilize similar block-triangular matrices to reduce systems, such as Keizer's memoryless contractions and hierarchical Volterra equations in the Zwanzig-Mori formalism. For further study, one may take into account more subtle issues, such as the approximate lumping for non-Markovian networks and the deformation of hidden complexity of free energy surfaces.
FIG. 1. A system consists of N = 10^ macromolecules, each of which can be regarded as a four-state system A with $[k_{13}, k_{31}, k_{14}, k_{41}, k_{23}, k_{32}, k_{24}, k_{42}] = [8, 6, 2, 7, 9, 6, 1, 5]$ and $k_{12} = k_{21} = k_{34} = k_{43} = 0$ or a two-state system A' with $[k'_{12}, k'_{21}] = [10, 12]$, satisfying the KE condition (5). The inter-state transitions are stochastic and follow the probabilities assigned by the rate constants in RE (1). (a) The four erratic curves are the stochastic trajectories of the ratio, $n_i = N_i/N$, of the molecules in the $i$-th state, with $i = 1, \ldots, 4$, calculated by this Markov chain to simulate the results of the CME. Averaging over 10“^ realizations, the intrinsic fluctuations tend to zero and the four erratic curves become four smooth curves. (b) The dynamics of $n_1 + n_2$ and $n_3 + n_4$ recorded from a single realization of A are only roughly close to those of $n'_1$ and $n'_2$ of its KE system A'. (c) Averaging over 10“^ realizations, $n_1 + n_2$ and $n_3 + n_4$ precisely approach $n'_1$ and $n'_2$, respectively. The coincidence in the mean dynamics leads to the coincidence in their variances, as shown by the four overlapped thick lines at the bottom.


FIG. 2. A system consists of 10^ identical macromolecules, each described by a three-state transition network A with the rate constants $[k_{12}, k_{21}, k_{23}, k_{32}, k_{31}, k_{13}] = [0.07, 1, 0.5, 9, 0.4, 30]$. Network A can be contracted into a two-dimensional TE network A' (A'') by merging states 1 and 2 (1 and 3) of A into state 1' of A' (A'') and renaming state 3 (2) of A as state 2' of A' (A''). The probability, $p(x_k, t)$, of finding $x_k$ macromolecules in the $k$-th state at time $t$ is estimated by counting the frequency of that event when the system evolves 10^ times. Two initially distinct distributions $p(N_1 + N_2, t)$ of A (blue) and $p(N'_1, t)$ of A' (green), as well as $p(N_1 + N_3, t)$ of A (red) and $p(N''_1, t)$ of A'' (yellow), approach each other as $t \to \infty$. This example demonstrates the increasing lumpability between the probabilities of two TE networks, as indicated by (7) and (13), in terms of the marginal probability $p(x_k, t)$.






































